perm filename MERGE.JAM[UP,DOC] blob sn#466203 filedate 1979-08-17 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00005 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002		MORE DETAILS ON MERGE FILES
C00014 00003		STILL MORE DETAILS
C00016 00004		A DIRECTORY READING SUBROUTINE
C00020 00005		MORE DETAILED USAGE
C00027 ENDMK
C⊗;
	MORE DETAILS ON MERGE FILES

		DSKB:MRGPAK.HDR[MA,JAM]
		DSKB:MRGPAK.SAI[MA,JAM]

MRGPAK.HDR gives the declarations and brief explanations for the merge
file routines, and MRGPAK.SAI is the actual code therefor.

A merge file is a binary file that has two parts: a directory containing
8-word entries for each function in the file, and the data part with
each function packed snugly up against the next. The directory format
is as follows:

0	sixbit /NAME/		Name and extension of the function
1	sixbit /EXT/		(EXT can be up to 6 chrs too)
2	<wdno 1:128>,,<recno 1:N>	Word and record of data location
3	<total #wds>,,<last rec>	Word count and last record (redundant)
4	clock		(I) Sampling rate
5	comprs		(I) Compression, like 1, 16, or 32
6	anyval		(R) function-dependent value
7	anyval		(R) function-dependent value

The data itself is then located by the <wdno>,,<recno> word in the directory
portion, and is #wds long. This probably needs some more explanation. The
name and extension are just like disk file names and extensions, except that
the extension can be up to 6 characters long. The records of the file
are numbered starting at 1, which is the beginning of the directory. The
directory is terminated by an entry with a zero name (word 0 of the entry).
The routines for manipulating merge files only look at the first 4 words
of the directory entry (the name and the retrieval information). The
remaining four words are strictly for the user's benefit.
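
As a rough sketch (this is not taken from MRGPAK.SAI, and the variable
names are made up for illustration), the halfword pairs in words 2 and 3
can be picked apart with a shift and a mask, assuming WDREC and WCLR
already hold those two words of a directory entry:

	begin "UNPK"
	    integer wdrec,wclr,wdno,recno,nwds,lastrec;
	    Comment WDREC and WCLR are assumed to have been read in already;
	    wdno←wdrec lsh -18;		Comment left half, word within record;
	    recno←wdrec land '777777;	Comment right half, record number;
	    nwds←wclr lsh -18;		Comment left half, total word count;
	    lastrec←wclr land '777777;	Comment right half, last record;
	end "UNPK";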

Most of the time, you really don't need to know all this. You only need
to know about how to use the routines in MRGPAK. (See MRGPAK.HDR[MA,JAM]
for a list of the names of the routines, see MRGPAK.SAI[MA,JAM] for
the code).

CREATING MERGE FILES:

You can bring a merge file into existence just by writing 2 or more
blocks of zeros, then calling APPEND to add a new function to the file.
For instance, the following code will do exactly that:

	begin "ZEROS"
	    real array zero[1:512];
	    open(1,"DSK",'17,0,0,200,brk,eof);
	    enter(1,"NEW.MRG",fail);
	    if fail then usererr(0,0,"Help?!?!??");
	    arryout(1,zero[1],512);
	    close(1);
	    release(1);
	end "ZEROS";
	open(1,"DSK",'17,0,0,200,brk,eof);
	lookup(1,"NEW.MRG",fail);
	if fail then usererr(0,0,"It went away???");

This gives you a nice clean virgin merge file. Note one thing here:
for all the MRGPAK routines to work right, the file must be opened in
mode '17, binary dump mode.

After this, you may write a function into the new file with the following
routine:

    External procedure APPEND(integer chan; string Fname;
	integer clock,comprs,wd1,wd2,locX,nwds);

Assuming that a merge file exists and has been opened on channel CHAN
in mode '17, you can append a new function to it with this procedure.
The meaning usually assigned to CLOCK and COMPRS is that CLOCK represents
the original sampling rate and COMPRS represents any decimation in time
that may have taken place in the computation of this function, so that
the sampling rate of this function is in fact CLOCK/COMPRS and that
it represents a duration of time of NWDS*COMPRS/CLOCK seconds. The
variables WD1 and WD2 are copied faithfully without interpretation.
LOCX is the location of the first data word of the data array minus
one, and NWDS is the number of valid data words in the function. For instance,
a sample call might be this:

	begin "CR"
	    real array X[1:368];
	    integer i;
	    for i←1 step 1 until 368 do
		X[i]←sin(2*π*(i-1)/368);
	    append(1,"SINE.WAVE",25600,1,0,0,location(X[1])-1,368);
	end "CR";

Note that we subtracted one from the address of the first data word, and
that the extension is not limited to 3 characters. If the function is
actually at the sampling rate, COMPRS should be 1.
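
To make the arithmetic concrete, here is a small sketch with made-up
numbers: a 368-word function decimated by 16 from an original clock of
25600. RATE and SECS are names invented for the example, and the sketch
relies on "/" giving a real result in SAIL:

	begin "DUR"
	    integer clock,comprs,nwds;
	    real rate,secs;
	    clock←25600;
	    comprs←16;
	    nwds←368;
	    rate←clock/comprs;		Comment 1600 samples per second;
	    secs←nwds*comprs/clock;	Comment 0.23 seconds;
	end "DUR";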

Retrieving the function presents a tiny problem in that you don't necessarily
know how long the function is before you ask for it. For this purpose, there
are several directory search routines that can be used.


    External boolean procedure DIRGET(integer chan; string file;
	reference integer clock,comprs,wd1,wd2,nwds);

You call this (with, as usual, the merge file open on channel CHAN in
mode '17) and the file name as a string, and it returns TRUE if the
function exists in this merge file, and it returns you the relevant data
words, including the length of the function. If it returns FALSE, the
function was not found. You can then allocate an array of that length
and call the following routine:

    External boolean procedure EXTRACT(integer chan; string Fname; integer locX;
	reference integer nwds);

You give it the file name and a channel and the location of the word
before the first data word of your array, and it reads the data and returns
you the number of words read in NWDS. This had better correspond with the
number you got from DIRGET or there is a bug. In addition, it returns
FALSE upon a successful read and TRUE if there has been any error. (Note
that the sense of the returned booleans is opposite between these two
routines.) If you got a successful return from DIRGET, then you had better
get a successful return from this one or there is a bug.
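
Putting the two together, a retrieval might look like the following
sketch. It assumes the merge file is open on channel 1 in mode '17 and
that the SINE.WAVE function from the earlier example is in it; the block
names and NREAD are made up for illustration:

	begin "GETFN"
	    integer clock,comprs,wd1,wd2,nwds,nread;
	    if dirget(1,"SINE.WAVE",clock,comprs,wd1,wd2,nwds) then
	    begin "GOTIT"
		real array X[1:nwds];
		Comment EXTRACT returns TRUE on error, the opposite of DIRGET;
		if extract(1,"SINE.WAVE",location(X[1])-1,nread)
		    then usererr(0,0,"EXTRACT failed")
		else if nread neq nwds
		    then usererr(0,0,"Length mismatch");
	    end "GOTIT"
	    else outstr("SINE.WAVE not found"&crlf);
	end "GETFN";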

There is a further facility for inserting text commentary into the
merge file:


    External string procedure GETCOM(integer chan);

    External string procedure FCOM(integer chan; string cname;
	reference integer time,date);

    External procedure RCOM(integer chan; string cname,cnew);

This is a system for putting commentary into the merge file. The
way it works is that it creates a function called COMMEN.TXT in the
merge file. It then takes a comment from you (called CNEW) and an
identifier (called CNAME above) and inserts it in COMMEN.TXT along with
the date and the time. It synthesizes the date and time all by itself;
you just give it a naked string. You use RCOM to replace (or insert
for the first time) a comment and FCOM to read back the comment
associated with a given identifier. Note that there can only be
one comment associated with a given identifier at a time. GETCOM
returns you a string that is all the commentary that exists in
the COMMEN.TXT function in the merge file. For the curious of heart,
the format of the text is as follows:

    ∂CNAME  Date    Time
    <comment>

    ∂NEXTCNAME	Date	Time
    <nextcomment>

    . . .

The only restriction on the text of the comment or its identifying name
is that they must not contain the character delta (∂), because this
is used to separate the entries.

As an example, assuming we have a merge file open on channel 1 in mode
'17, we can insert a comment under, for instance, the name PITCH as
follows:

	rcom(1,"PITCH","N=128, del=0.05, and temp=TRUE");

And we can get that comment back, along with the time and date (in
PDP-10 system time/date format) as follows:

	comm←fcom(1,"PITCH",tim,dat);

If you give another RCOM with the identifying name PITCH, it will replace
the old comment with the new one, so that FCOM always gets only the most
recent comment put in by RCOM.
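
For instance, the following sketch replaces the PITCH comment and then
types out everything in COMMEN.TXT. The new comment text is made up for
illustration, and COMM, TIM and DAT are assumed to be declared as a
string and two integers:

	rcom(1,"PITCH","N=256, del=0.05, and temp=FALSE");
	comm←fcom(1,"PITCH",tim,dat);
	outstr(getcom(1));

After the FCOM call, COMM should hold the N=256 text rather than the
earlier N=128 one.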
	STILL MORE DETAILS

If you are dealing with large merge files, you will find that often
it is not possible to read an entire function into memory at a time.
What you really might want to do is to just read (or update) a few
words at a time. For this, what you can do is first get the directory
entry into memory (assuming the function already exists), and then
use RDFUN and UPDFUN to read (or write) little parts of it at a time.
The first object, though, is to get the directory into core. The format
we have chosen for an in-core directory entry is just the 8 directory
words supplemented by a 9th word which is the arithmetic sum of the
first 8. This checksum allows us to tell when a clobbered program is
trying to write on a merge file and provides a bit of extra safety.
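
A check of this kind might be sketched as follows. This is a guess at
the sort of test the routines make, not code from MRGPAK.SAI; CKOK is a
made-up name, and DLOC is assumed to be the address of the first of the
nine words:

	boolean procedure CKOK(integer dloc);
	begin "CKOK"
	    integer i,sum;
	    sum←0;
	    for i←0 step 1 until 7 do
		sum←sum+memory[dloc+i];
	    return(sum=memory[dloc+8]);
	end "CKOK";

If a stray store has changed any one of the nine words, the sum no
longer matches and CKOK returns FALSE.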

The definitions and program on the next page will read in the directory
entries for all the functions of a merge file and return a record
pointer to a description of the file.
	A DIRECTORY READING SUBROUTINE

record_class MFUN (record_pointer(mfun) next,last;
		record_pointer(any_class) mergefile;
		string name;
		real maxval,minval;
		integer Np;
		integer name6,ext6,wdrec,wclr,
		    clock,comprs,userw1,userw2,cksum);

record_class MERGEFILE (record_pointer(mergefile) next,last;
		record_pointer(mfun) functions;
		string dev,file;
		integer chan);

record_pointer(mergefile) procedure GET_MERGE(integer chan; string dev,file);
begin
    integer ind,word,fcnt;
    record_pointer(mfun) fun,lfun;
    record_pointer(mergefile) head;

    open(chan,dev,'10,2,0,200,brk,eof);
    lookup(chan,file,fail);
    if fail then
    begin "FNF"
	release(chan);
	return(null_record);
    end "FNF";

    fcnt←0;
    head←new_record(mergefile);
    mergefile:dev[head]←dev;
    mergefile:file[head]←file;
    mergefile:chan[head]←chan;

    lfun←null_record;
    while (word←wordin(chan)) neq 0 do
    begin "BLDM"
	fun←new_record(mfun);
	mfun:name6[fun]←word;
	mfun:ext6[fun]←wordin(chan);
	mfun:wdrec[fun]←wordin(chan);
	mfun:wclr[fun]←wordin(chan);
	mfun:clock[fun]←wordin(chan);
	mfun:comprs[fun]←wordin(chan);
	mfun:userw1[fun]←wordin(chan);
	mfun:userw2[fun]←wordin(chan);

	mfun:cksum[fun]←
		mfun:name6[fun] + mfun:ext6[fun] + mfun:wdrec[fun] +
		mfun:wclr[fun] + mfun:clock[fun] + mfun:comprs[fun] +
		mfun:userw1[fun] + mfun:userw2[fun];
	mfun:name[fun]←cv6str(mfun:name6[fun])&"."&
			cv6str(mfun:ext6[fun]);
	mfun:Np[fun]←mfun:wclr[fun] lsh -18;
	mfun:mergefile[fun]←head;
	if lfun=null_record then
	    mergefile:functions[head]←fun
	else mfun:next[lfun]←fun;
	mfun:last[fun]←lfun;
	lfun←fun;
	fcnt←fcnt+1;
    end "BLDM";
    lfun←fun←null_record;
    outstr("GET MERGE: "&cvs(fcnt)&" functions found"&crlf);

    close(chan);
    release(chan);

    open(chan,dev,'17,0,0,200,brk,eof);
    lookup(chan,file,fail);
    if fail then
	usererr(0,0,"#??!? MERGE FILE WENT AWAY???");

    return(head);
end;
	MORE DETAILED USAGE

Once you have called GET_MERGE, then you can look for a given function
with a little loop like the following:

    fn←mergefile:functions[mfile];
    while fn neq null_record do
    begin "FNTM"
	if equ(template,mfun:name[fn]) then
	begin "FND1"
Comment Do whatever you want to with the function here ;
	    done "FNTM";
	end "FND1";
	fn←mfun:next[fn];
    end "FNTM";

This looks for a function by the name of TEMPLATE and enters the
block FND1 when it finds one. Once you have found one, you may
read, for instance, a part of it using RDFUN:

    External boolean procedure RDFUN(integer chan; integer dloc,
	    wstart,locX,nwds);

This routine reads the function specified by the directory entry whose address
is in DLOC. It only reads a part of it. It reads the NWDS words starting
at word WSTART (the first word of the function is counted as word zero).
This routine cannot read past functions, so it returns TRUE (error) if you
ask it to read past the end of the file. NOTE that in this case, LOCX
points to the actual first word of data (unlike before where it pointed to
the word before). In our example above, if you then wanted to read through
the function 100 words at a time, you might put the following code
inside FND1:

	begin "FND1"
	    for i←0 step 100 until mfun:Np[fn]-1 do
	    begin "RDIT"
		real array X[1:100];
		if rdfun(1,location(mfun:name6[fn]),i,location(X[1]),100)
		    then usererr(0,0,"Function disappeared");
		Comment ** munch the function here or whatever **;
	    end "RDIT";
	end "FND1";

Since NAME6 is the first word of the (checksummed) directory entry, we
give its location to RDFUN as the second parameter. In this little
example, we use I to set the starting word within the function to
read (the first word is number 0). Likewise, we can use UPDFUN to overwrite
a little portion of the function:

    External boolean procedure UPDFUN(integer chan; integer dloc,
	    wstart,locX,nwds);

It will return TRUE if any error is detected, such as if the checksum
does not correspond. With both these routines, any attempt to read
or write out of the bounds of the function will result in no-operations
for those words that are out of bounds. If you are reading, references
outside the bounds will be supplied with zeros. If you are writing, the
words out of the bounds will not be written.
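
As a sketch of the two routines working together, the following reads
the first 100 words of a function, halves them, and writes them back.
It assumes FN points at a directory record found by the search loop
above, that the function holds at least 100 words of real data, and
that the merge file is open on channel 1; the block name and DLOC are
made up for illustration:

	begin "SCALE"
	    real array X[1:100];
	    integer i,dloc;
	    dloc←location(mfun:name6[fn]);
	    if rdfun(1,dloc,0,location(X[1]),100)
		then usererr(0,0,"RDFUN failed");
	    for i←1 step 1 until 100 do
		X[i]←X[i]/2;
	    if updfun(1,dloc,0,location(X[1]),100)
		then usererr(0,0,"UPDFUN failed");
	end "SCALE";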

This brings us naturally to another question: how do those functions
get there in the first place? We have seen one way, which is to use
APPEND to tack functions together, but there is a quicker way to
create a complete file full of functions with all zero data, and that
is to use CRMRG:

    External procedure CRMRG(integer chan; string dev,file;
	    integer array names,exts,lens,clks,cmps,wd1s,wd2s;
	    integer nfiles);

CRMRG creates an empty merge file with NFILES functions in it, with
names, extensions and lengths taken from the given arrays. Those arrays
start at 1. It does the whole process (OPENs, writes, and CLOSEs the
file), so the channel has been released when it returns. You will have
to reopen it when it comes back. Notice that the NAMES and EXTS arrays are
integers, so they must already be in SIXBIT form. As an example, if
we want to create a merge file with three functions in it called
FUNCT1, FUNCT2.EXT, and FUNCT3, all with lengths of 100, we might
do the following:

	begin "CRM"
	    integer array names,exts,lens,clks,cmps,wd1s,wd2s[1:3];

	    names[1]←cvsix("FUNCT1");
	    names[2]←cvsix("FUNCT2");
	    exts[2]←cvsix("EXT");
	    names[3]←cvsix("FUNCT3");
	    lens[1]←100;
	    lens[2]←100;
	    lens[3]←100;
	    crmrg(1,"DSK","NEW.MRG",names,exts,lens,clks,cmps,wd1s,wd2s,3);
	end "CRM";

This will create the file, but with the clock rates, compressions, and
user-words all set to zero (because we didn't bother to set them above).
After this, you will have to OPEN and LOOKUP the file again (in mode
'17), and probably you will want to use GET_MERGE to read in the
locations of your new functions, so that you may call UPDFUN to write
into them.
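
A sketch of that last step might look like the following. Note that
GET_MERGE as listed earlier does the OPEN and LOOKUP in mode '17 itself
before it returns. The sketch assumes that the directory entries come
back in the order they were given to CRMRG, so that the first one is
FUNCT1; the block name and the ramp data are made up for illustration:

	begin "FILL"
	    record_pointer(mergefile) mfile;
	    record_pointer(mfun) fn;
	    real array X[1:100];
	    integer i;
	    mfile←get_merge(1,"DSK","NEW.MRG");
	    if mfile=null_record then usererr(0,0,"No NEW.MRG?");
	    fn←mergefile:functions[mfile];
	    for i←1 step 1 until 100 do
		X[i]←i;
	    if updfun(1,location(mfun:name6[fn]),0,location(X[1]),100)
		then usererr(0,0,"UPDFUN failed");
	end "FILL";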